Deserialization service

ABSTRACT

Approaches described relate to the management of messages in an electronic environment. In particular, various approaches provide for analyzing messages of different message types to efficiently process those messages in a service environment, such as a multi-tenant environment. The messages can include one or more message fields, allowing for a plurality of different message types. The messages can be analyzed to identify known message types, and processing of messages of the same type can be expedited, e.g., by more quickly deserializing that message using cached message offset information associated with that message type. For example, a message that includes message data and an identifier can be received. The identifier can be matched to an entry associated with the identifier and message offset information. The message offset information can be utilized to determine positions in the received message that are associated with message data and the message data can be obtained without having to build up a reference structure around the structure of the message. Thereafter, a deserialized message can be generated using the message data retrieved using the message offset information.

CROSS REFERENCE TO RELATED APPLICATION

This application is a Continuation of, and accordingly claims the benefit of, allowed U.S. patent application Ser. No. 15/273,272, filed with the U.S. Patent and Trademark Office on Sep. 22, 2016, which is hereby incorporated herein by reference.

BACKGROUND

Users are increasingly performing tasks using remote computing resources, often referred to as part of “the cloud.” This has many advantages, as users do not have to purchase and maintain dedicated hardware and software, and instead can pay for only those resources that are needed at any given time, where those resources typically will be managed by a resource provider. End users can communicate with those resources and process data with those resources, among other such tasks, via messages in various formats. It can be expensive, however, to initialize a service to read those messages. It can be even more expensive to determine and manage messages of different types, as well as variations to those messages of various types.

BRIEF DESCRIPTION OF THE DRAWINGS

Various embodiments in accordance with the present disclosure will be described with reference to the drawings, in which:

FIG. 1 illustrates an example environment in which various embodiments can be implemented.

FIG. 2 illustrates an example subsystem for deserializing messages that can be utilized in accordance with various embodiments.

FIG. 3 illustrates an example message in accordance with various embodiments.

FIG. 4 illustrates an example message that can be utilized in accordance with various embodiments.

FIG. 5 illustrates an example process for deserializing messages that can be utilized in accordance with various embodiments.

FIG. 6 illustrates another example process for deserializing messages that can be utilized in accordance with various embodiments.

FIG. 7 illustrates example components of a computing device that can be used to implement aspects of various embodiments.

DETAILED DESCRIPTION

In the following description, various embodiments will be described. For purposes of explanation, specific configurations and details are set forth in order to provide a thorough understanding of the embodiments. However, it will also be apparent to one skilled in the art that the embodiments may be practiced without the specific details. Furthermore, well-known features may be omitted or simplified in order not to obscure the embodiment being described.

Approaches described relate to the management of messages in an electronic environment. In particular, various approaches provide for analyzing different types of messages to efficiently process those messages in a service environment, such as a multi-tenant environment. The messages can include one or more message fields, allowing for a plurality of different message types. The messages can be analyzed to identify known message types, and processing of messages of the same type can be expedited, e.g., by more quickly deserializing that message using cached message offset information associated with that message type. For example, a message that includes message data and an identifier can be received. The identifier can be matched to an entry associated with the identifier and message offset information. The message offset information can be utilized to determine positions in the received message that are associated with message data and the message data can be obtained without having to build up a reference structure around the structure of the message. Thereafter, a deserialized message can be generated using the message data retrieved using the message offset information.

Embodiments provide a variety of advantages and are applicable in a number of situations. For example, in accordance with various embodiments, such approaches can be utilized by any resource that provides and/or receives messages. In another such application, approaches can be utilized for calls between services. In accordance with various embodiments, by providing better message processing, the system can more efficiently and quickly process messages. Accordingly, fewer interactions are necessary for processing messages. As such, fewer resources of the system are necessary to execute deserializing and other such processes. Various other such functions can be used as well within the scope of the various embodiments as would be apparent to one of ordinary skill in the art in light of the teachings and suggestions contained herein.

FIG. 1 illustrates an example environment 100 in which aspects of the various embodiments can be implemented. In this example a user is able to utilize a client device 102 to submit requests across at least one network 104 to a resource provider environment 106. The client device can include any appropriate electronic device operable to send and receive requests, messages, or other such information over an appropriate network and convey information back to a user of the device. Examples of such client devices include personal computers, tablet computers, smart phones, notebook computers, and the like.

In accordance with various embodiments, a message can include one or more message fields. Each message field can include message data. A message type can be based on the message fields included in a message and an order of those message fields in the message. In an embodiment, messages of the same type include the same message fields in the same order. In various embodiments, messages of the same type can be indicated by an identifier included in the message or provided along with the message, where messages of the same type include matching identifiers. As described herein, such an indicator is not necessary to determine messages of the same type. In various embodiments, messages can be analyzed to identify the message fields and the order of the message fields in the message, where messages that include the same message fields in the same order can be of the same message type. An example message includes a JavaScript Object Notation (JSON) message. JSON is an open-standard format that uses human-readable text to transmit data objects consisting of attribute-value pairs. For purposes of illustration only, semi-structured data is represented in JSON format. Other self-describing, semi-structured formats can be used according to the principles of the present disclosure. Source data does not need to be self-describing. The description can be separated from the data, as would be the case with something like protocol buffers. As long as there are rules, heuristics, or wrapper functions to apply tags to the data, any input data can be turned into objects similar to a JSON format.
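For purposes of illustration only, the notion that a message type follows from the message fields and their order can be sketched in Python. The function name `message_type` is hypothetical and not part of the disclosure, and the sketch assumes the message is a JSON object at the top level.

```python
import json


def message_type(raw: str) -> tuple:
    """Return the top-level field names of a JSON object, in message order."""
    # object_pairs_hook=list keeps each object as an ordered list of
    # (name, value) pairs instead of collapsing it into a dict.
    pairs = json.loads(raw, object_pairs_hook=list)
    return tuple(name for name, _value in pairs)


first = message_type('{"SomeValue": 1, "SomeOtherValue": "SomeString"}')
second = message_type('{"SomeValue": 7, "SomeOtherValue": "Another"}')
reordered = message_type('{"SomeOtherValue": "SomeString", "SomeValue": 1}')

print(first == second)     # same fields in the same order
print(first == reordered)  # same fields in a different order
```

Under this sketch, the first two messages share a type because only their values differ, while the third is treated as a different type because its fields appear in a different order.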

In accordance with various embodiments, a name-value pair, key-value pair, field-value pair or attribute-value pair is a fundamental data representation in computing systems and applications. In such situations, all or part of the data model may be expressed as a collection of tuples <attribute name, value>; each element is an attribute-value pair. Depending on the particular application and the implementation chosen by programmers, attribute names may or may not be unique. In the example of a JSON message, JSON syntax specifies that message data is in name/value pairs, where message data is separated by commas, curly braces hold objects, and square brackets hold arrays. A name/value pair consists of a field name (in double quotes), followed by a colon, followed by a value. JSON values can be a number (integer or floating point), a string (in double quotes), a Boolean (true or false), an array (in square brackets), an object (in curly braces), or null. JSON objects are written inside curly braces. As in JavaScript, JSON objects can contain multiple name/value pairs. JSON is a dynamic, self-descriptive format: each field is explicitly mapped to a value, and the ordering of the fields is not deterministic.

As described, other self-describing, semi-structured formats can be used according to the principles of the present disclosure. Such examples include a protobuf message, a MessagePack message, or a concise binary object representation (CBOR) message, among others. At least one other example includes Amazon Ion, a richly-typed, self-describing, hierarchical data serialization format offering interchangeable binary and text representations. The text format (a superset of JSON) is easy to read and author, supporting rapid prototyping. Applications can consume Ion data in either its text or binary forms without loss of data fidelity. Ion's type system is a superset of JSON's: in addition to strings, Booleans, arrays (lists), objects (structs), and nulls, Ion adds support for arbitrary-precision timestamps, embedded binary values (blobs and clobs), and symbolic expressions. Ion also expands JSON's number specification by defining distinct types for arbitrary-size integers, IEEE-754 binary floating point numbers, and infinite-precision decimals. In binary Ion, common text tokens such as struct field names are automatically stored in a symbol table. This allows these tokens to be efficiently encoded as table offsets instead of repeated copies of the same text. As a further space optimization, symbol tables can be pre-shared between producer and consumer so that only the table name and version are included in the payload, eliminating the overhead involved with repeatedly defining the same symbols across multiple pieces of Ion data. A symbol table can include one or more symbols. Symbols are much like strings, in that they are Unicode character sequences. The primary difference is the intended semantics: symbols represent semantic identifiers as opposed to textual literal values. In the text format, symbols can be delimited by single-quotes and use the same escape characters. A subset of symbols are called identifiers and can be denoted in text without single-quotes. An identifier can be a sequence of ASCII letters, digits, or the characters $ (dollar sign) or _ (underscore), not starting with a digit.
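For purposes of illustration only, the following Python sketch shows how a symbol table can replace repeated field names with table offsets, together with a check for the identifier rule stated above. The `SymbolTable` class and `is_identifier` function are hypothetical names; the sketch does not reproduce Ion's actual binary encoding, its system symbols, or its reserved words.

```python
import re

# Rule stated in the text: ASCII letters, digits, '$' or '_',
# not starting with a digit. Reserved words are ignored in this sketch.
_IDENTIFIER = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def is_identifier(symbol: str) -> bool:
    """Return True if the symbol could be written without single quotes."""
    return _IDENTIFIER.fullmatch(symbol) is not None


class SymbolTable:
    """Hypothetical table mapping each distinct symbol to a small integer."""

    def __init__(self):
        self._ids = {}
        self._symbols = []

    def intern(self, symbol: str) -> int:
        """Return the table offset for a symbol, adding it on first use."""
        if symbol not in self._ids:
            self._ids[symbol] = len(self._symbols)
            self._symbols.append(symbol)
        return self._ids[symbol]

    def lookup(self, offset: int) -> str:
        """Return the symbol text stored at a table offset."""
        return self._symbols[offset]


table = SymbolTable()
records = [
    {"SomeValue": 1, "SomeOtherValue": "a"},
    {"SomeValue": 2, "SomeOtherValue": "b"},
]
# Each field name is stored once; records carry only the table offsets.
encoded = [[(table.intern(k), v) for k, v in r.items()] for r in records]
```

In this sketch both records refer to offsets 0 and 1 rather than repeating the field name text, which is the space saving the symbol table is meant to provide.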

The at least one network 104 can include any appropriate network, including an intranet, the Internet, a cellular network, a local area network (LAN), or any other such network or combination, and communication over the network can be enabled via wired and/or wireless connections. The resource provider environment 106 can include any appropriate components for receiving requests and returning information or performing actions in response to those requests. As an example, the provider environment might include Web servers and/or application servers for receiving and processing requests, then returning data, Web pages, video, audio, or other such content or information in response to the request.

In various embodiments, the provider environment may include various types of electronic resources that can be utilized by multiple users for a variety of different purposes. In at least some embodiments, all or a portion of a given resource or set of resources might be allocated to a particular user or allocated for a particular task, for at least a determined period of time. The sharing of these multi-tenant resources from a provider environment is often referred to as resource sharing, Web services, or “cloud computing,” among other such terms and depending upon the specific environment and/or implementation. In this example the provider environment includes a plurality of electronic resources 114 of one or more types. These types can include, for example, application servers operable to process instructions provided by a user or database servers operable to process data stored in one or more data stores 116 in response to a user request. As known for such purposes, the user can also reserve at least a portion of the data storage in a given data store. Methods for enabling a user to reserve various resources and resource instances are well known in the art, such that detailed description of the entire process, and explanation of all possible components, will not be discussed in detail herein.

In at least some embodiments, a user wanting to utilize a portion of the resources 114 can submit a request that is received at an interface layer 108 of the provider environment 106. The interface layer can include application programming interfaces (APIs) or other exposed interfaces enabling a user to submit requests to the provider environment. The interface layer 108 in this example can also include other components as well, such as at least one Web server, routing components, load balancers, and the like. When a request to provision a resource is received at the interface layer 108, information for the request can be directed to a resource manager 110 or other such system, service, or component configured to manage user accounts and information, resource provisioning and usage, and other such aspects. A resource manager 110 receiving the request can perform tasks such as to authenticate an identity of the user submitting the request, as well as to determine whether that user has an existing account with the resource provider, where the account data may be stored in at least one data store 112 in the provider environment. A user can provide any of various types of credentials in order to authenticate an identity of the user to the provider. These credentials can include, for example, a username and password pair, biometric data, a digital signature, or other such information. These credentials can be provided by, or obtained from, a number of different entities, such as a certificate authority, a key management service, a corporate entity, an identity broker such as a SAML provider, and the like. In some embodiments, a user can provide information useful in obtaining the credentials, such as user identity, account information, password, user-specific cryptographic key, customer number, and the like. The identity provider can provide the credentials to the resource provider environment 106 and/or to the client device 102, whereby the client device can utilize those credentials to obtain access or use of various resources in the provider environment, where the type and/or scope of access can depend upon factors such as a type of user, a type of user account, a role associated with the credentials, or a policy associated with the user and/or credentials, among other such factors. In some embodiments the resources or operators within the environment can obtain credentials useful in signing commands or requests for various purposes as discussed and suggested herein. Although illustrated outside the resource provider environment, it should be understood that the certificate authority could be a service offered from within the resource provider environment, among other such options.

The resource provider can validate this information against information stored for the user. If the user has an account with the appropriate permissions, status, etc., the resource manager can determine whether there are adequate resources available to suit the user's request, and if so can provision the resources or otherwise grant access to the corresponding portion of those resources for use by the user for an amount specified by the request. This amount can include, for example, capacity to process a single request or perform a single task, a specified period of time, or a recurring/renewable period, among other such values. If the user does not have a valid account with the provider, the user account does not enable access to the type of resources specified in the request, or another such reason is preventing the user from obtaining access to such resources, a communication can be sent to the user to enable the user to create or modify an account, or change the resources specified in the request, among other such options.

Once the user is authenticated, the account verified, and the resources allocated, the user can utilize the allocated resource(s) for the specified capacity, amount of data transfer, period of time, or other such value. In at least some embodiments, a user might provide a session token or other such credentials with subsequent requests in order to enable those requests to be processed on that user session. The user can receive a resource identifier, specific address, or other such information that can enable the client device 102 to communicate with an allocated resource without having to communicate with the resource manager 110, at least until such time as a relevant aspect of the user account changes, the user is no longer granted access to the resource, or another such aspect changes. The same or a different authentication method may be used for other tasks, such as for the use of cryptographic keys. In some embodiments a key management system or service can be used to authenticate users and manage keys on behalf of those users. A key and/or certificate management service can maintain an inventory of all keys and certificates issued as well as the user to which they were issued. Some regulations require stringent security and management of cryptographic keys which must be subject to audit or other such review. For cryptographic key pairs where both public and private verification parameters are generated, a user may be granted access to a public key while private keys are kept secure within the management service. A key management service can manage various security aspects, as may include authentication of users, generation of the keys, secure key exchange, and key management, among other such tasks.

The resource manager 110 (or another such system or service) in this example can also function as a virtual layer of hardware and software components that handles control functions in addition to management actions, as may include provisioning, scaling, replication, etc. The resource manager can utilize dedicated APIs in the interface layer 108, where each API can be provided to receive requests for at least one specific action to be performed with respect to the data environment, such as to provision, scale, clone, or hibernate an instance. Upon receiving a request to one of the APIs, a Web services portion of the interface layer can parse or otherwise analyze the request to determine the steps or actions needed to act on or process the call. For example, a Web service call might be received that includes a request to create a data repository.

An interface layer 108 in at least one embodiment includes a scalable set of customer-facing servers that can provide the various APIs and return the appropriate responses based on the API specifications. The interface layer also can include at least one API service layer that in one embodiment consists of stateless, replicated servers which process the externally-facing customer APIs. The interface layer can be responsible for Web service front end features such as authenticating customers based on credentials, authorizing the customer, throttling customer requests to the API servers, validating user input, and marshalling or unmarshalling requests and responses. The API layer also can be responsible for reading and writing database configuration data to/from the administration data store, in response to the API calls. In many embodiments, the Web services layer and/or API service layer will be the only externally visible component, or the only component that is visible to, and accessible by, customers of the control service. The servers of the Web services layer can be stateless and scaled horizontally as known in the art. API servers, as well as the persistent data store, can be spread across multiple data centers in a region, for example, such that the servers are resilient to single data center failures.

As mentioned, a customer of such a multi-tenant environment might send a number of messages of different message types. The messages can include message fields, e.g., keys, that include respective message data, e.g., values, a symbol table that describes the message fields included in the message, and an identifier that can be used to identify messages of the same message type. Using conventional approaches, it can be expensive in terms of time and processing power to analyze each message to extract relevant information from the message. For example, in a conventional approach, when a message is received, the message is analyzed to identify the keys and associated values for those keys in the message. This process is repeated for each message received. Accordingly, approaches presented herein provide a framework that identifies successive messages of the same type, determines a protocol for those messages, and uses information about the protocol to deserialize messages of the same type without building up a reference structure around the structure of the message.

In one example, a message identifier is not present in the message. In this example, when the message is received, the protocol for that message type can be determined, and the protocol can be used to deserialize successive messages of that type. For example, the message can be received and the message can be analyzed to determine the location of the message fields included in the message, and what those message fields are. The message field data associated with those message fields can be recorded in a table and message offset information can be cached. The message offset information can be used to determine a position in the table and/or message that includes the message field data for a particular message. When a subsequent message of the same type is received, the message offset information is used to read the message field data from the subsequent message, and the message field data can be used to deserialize the subsequent message. Accordingly, the subsequent message does not have to be parsed to identify the message fields and the associated message field data for those message fields. Instead, because the message offset information has already been determined for messages of the same type, the message offset information can be used to extract the appropriate information from the subsequent message.

In another example, a message identifier is present in the message. In this example, when the message is received, the identifier can be used to obtain (if available) or determine (if not available) message offset information. As described, message offset information can be utilized to determine positions in the received message that are associated with message data and the message data can be obtained without having to build up a reference structure around the structure of the message. The identifier can be any appropriate identifier. An example identifier can be a hash value. The identifier and/or message offset information can be determined by any number of resources in the environment. This can include, e.g., client devices, a message service, etc. When the message is received, the identifier associated with the message can be compared to a list of identifiers. In the situation where the identifier is not included in the list of identifiers, an entry that includes the identifier, the message field data associated with those message fields, and message offset information can be recorded in one or more tables. When a subsequent message is received that includes the same identifier, the message offset information associated with the stored identifier can be used to read the message field data from the subsequent message and/or a table that includes the message data of the subsequent message without having to parse the subsequent message to identify the message fields to extract the associated message field data.
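For purposes of illustration only, the identifier-based flow described above can be sketched in Python. The wire format, the `offset_cache` dictionary, and the functions `analyze`, `deserialize`, and `build` are all assumptions made for this sketch and are not taken from the disclosure. The sketch further assumes fixed-width values, so that the byte offsets computed for one message apply to every later message carrying the same identifier.

```python
import struct

# Hypothetical wire format, invented for this sketch:
#   1 byte   field count N
#   N times  (1-byte name length, name bytes, 1-byte struct type code)
#   values   packed big-endian in field order, with no padding
offset_cache = {}  # identifier -> [(field name, struct format, byte offset)]


def analyze(message: bytes):
    """Slow path: walk the header once and compute where each value starts."""
    count = message[0]
    pos = 1
    fields = []
    for _ in range(count):
        name_len = message[pos]
        name = message[pos + 1:pos + 1 + name_len].decode("ascii")
        code = chr(message[pos + 1 + name_len])
        fields.append((name, ">" + code))
        pos += name_len + 2
    layout = []
    for name, fmt in fields:
        layout.append((name, fmt, pos))
        pos += struct.calcsize(fmt)
    return layout


def deserialize(identifier: str, message: bytes) -> dict:
    """Use cached offsets when the identifier is known; analyze otherwise."""
    layout = offset_cache.get(identifier)
    if layout is None:
        layout = analyze(message)
        offset_cache[identifier] = layout
    # Fast path: read each value directly at its cached byte offset.
    return {name: struct.unpack_from(fmt, message, offset)[0]
            for name, fmt, offset in layout}


def build(fields, values) -> bytes:
    """Helper that produces a message in the hypothetical format."""
    header = bytes([len(fields)])
    for name, code in fields:
        header += bytes([len(name)]) + name.encode("ascii") + code.encode("ascii")
    body = b"".join(struct.pack(">" + code, value)
                    for (_name, code), value in zip(fields, values))
    return header + body


schema = [("id", "i"), ("score", "d")]
first = deserialize("type-a", build(schema, [1, 0.5]))    # header is analyzed
second = deserialize("type-a", build(schema, [2, 0.25]))  # cached offsets are reused
```

In this sketch only the first message with identifier "type-a" has its header walked; the second message is read directly at the cached offsets.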

FIG. 2 illustrates an example framework implementation 200 that can be utilized in accordance with various embodiments. In this example, a message service 202 includes a search module 204, a processing module 206, and a deserialization module 208, although additional or alternative components and elements can be used in such a system in accordance with the various embodiments. Accordingly, it should be noted that additional services, providers, and/or components can be included in such a system, and although some of the services, providers, components, etc. are illustrated as being separate entities and/or components, the illustrated arrangement is provided as an example arrangement and other arrangements as known to one skilled in the art are contemplated by the embodiments described herein.

In this example, the message service can communicate with a set of client devices 210, 212 located in a publicly-accessible (or at least customer-accessible) resource environment. The message service 202 can receive messages from the client devices and deserialize the received messages. In this example, a message can be digested by a client device, and the client device can generate an identifier based on the message fields (e.g., keys) in the message, the message field type (e.g., key type of the keys), and the message fields' (e.g., keys') position in the message. In certain embodiments, message offset information can be determined at the time the identifier is generated. In other embodiments, the message offset information can be determined at the message service 202, as will be described herein. Message offset information can be used to read message field data (e.g., values) from the message at specific locations in the message.
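For purposes of illustration only, an identifier derived from the message fields, their types, and their positions can be sketched in Python as follows. The function name `type_identifier`, the use of SHA-256, and the separator bytes are assumptions of this sketch; the disclosure does not specify a particular hash function.

```python
import hashlib


def type_identifier(fields) -> str:
    """Hash (position, name, type) for each field into a hex identifier.

    fields is an ordered sequence of (field name, type name) pairs.
    """
    digest = hashlib.sha256()
    for position, (name, type_name) in enumerate(fields):
        # Separator characters keep adjacent parts from running together,
        # so ("ab", "c") and ("a", "bc") produce different input bytes.
        part = f"{position}\x1f{name}\x1f{type_name}\x1e"
        digest.update(part.encode("utf-8"))
    return digest.hexdigest()


base = type_identifier([("SomeValue", "int"), ("SomeOtherValue", "string")])
same = type_identifier([("SomeValue", "int"), ("SomeOtherValue", "string")])
reordered = type_identifier([("SomeOtherValue", "string"), ("SomeValue", "int")])
retyped = type_identifier([("SomeValue", "float"), ("SomeOtherValue", "string")])
```

Under these assumptions, two messages with the same fields, types, and order yield the same identifier, while changing the order or a field type yields a different one.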

In accordance with various embodiments, the client devices may operate an identifier service or at least be in communication with a service that can generate an identifier for each message. As described, the identifier can be used to identify messages of the same message type. The identifier can be any appropriate identifier. An example identifier is a hash value. A hash value is a numeric value of a fixed length that uniquely identifies data. A hash function can be used to generate the hash value. As will be described further herein, the hash values can be used in a hash table or database to look up message offset information (e.g., byte offset values) for messages of the same type (i.e., messages associated with the same hash value). A hash table (hash map), generally speaking, includes a data structure used to implement an associative array, a structure that can map keys to values. A hash table may use a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.

The message, identifier, and message offset information can be provided to the message service 202. For example, the message can be received at a network interface layer 214 of the message service. The network interface layer can include any appropriate components known or used to receive requests from across a network, such as may include one or more application programming interfaces (APIs) or other such interfaces for receiving such requests. The network interface layer 214 might be owned and operated by the provider, or leveraged by the provider as part of a shared resource or “cloud” offering. The network interface layer can receive and analyze the messages, and cause at least a portion of the information in the message to be directed to an appropriate system or service, such as search module 204 or processing module 206. For example, the interface can provide the message, message identifier (if available), and message offset information (if available) to search module 204.

In the situation where an identifier is available, search module 204 is operable to search a table of identifiers to determine whether there is a matching identifier to the received identifier. In the situation where there is no matching identifier, the search module can generate an entry in a table of identifiers in data store 216. The entry can include the identifier, message fields (e.g., keys), and associated message field data (e.g., key values). In the situation where message offset information is not provided, message offset information can be determined and stored in offset data store 218. When a subsequent message is received that includes the same identifier, the search module can retrieve the message offset information associated with the identifier. Deserialization module 208 can use the message offset information to locate and read specific message portions of the message, and use the information to deserialize the message. In accordance with various embodiments, deserializing a message includes obtaining message data from a table, a file, an incoming network socket, etc. and reconstructing an object model using the message data. Serialization is the process of turning an object into a series of bytes for transferring or storing. Accordingly, the deserialization module obtains message data, e.g., bytes, corresponding to the message, and generates an object using the message data.
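For purposes of illustration only, the relationship between serialization and deserialization can be sketched in Python as a round trip between an object and a series of bytes. The `Reading` class and its fixed layout (a big-endian 32-bit integer followed by a 64-bit float) are hypothetical and chosen only to keep the sketch short.

```python
import struct
from dataclasses import dataclass

# Hypothetical fixed layout: big-endian int32 followed by float64.
_LAYOUT = struct.Struct(">id")


@dataclass
class Reading:
    sensor_id: int
    value: float


def serialize(reading: Reading) -> bytes:
    """Turn an object into a series of bytes for transfer or storage."""
    return _LAYOUT.pack(reading.sensor_id, reading.value)


def deserialize(data: bytes) -> Reading:
    """Reconstruct the object model from the message bytes."""
    sensor_id, value = _LAYOUT.unpack(data)
    return Reading(sensor_id, value)


original = Reading(sensor_id=7, value=1.5)
wire = serialize(original)
restored = deserialize(wire)
```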

In certain embodiments, messages may not include an identifier. In such situations, processing module 206 is operable to determine where the message fields (e.g., keys) and associated message field data (e.g., values) are positioned in the message, and what those keys are and how they relate to the overall system. In this approach, information associated with the keys and their associated values can be recorded in a table. For example, bytes corresponding to the information can be copied from the message into, for example, a string table, where the information can be cached according to a set of caching rules. In accordance with an embodiment, the set of caching rules can be used to clean up the table. For example, a number of entries in the table can be based at least in part on one of a frequency of receiving a particular identifier, an amount of memory to maintain the table, an amount of processing power to maintain the table, a number of entries in the table, an amount of time to search for an entry in the table, an importance indicator associated with entries in the table, among other such criteria. In one example, cleaning up the table can include determining an amount of memory to maintain the table, determining an amount of processing power to maintain the table, and removing entries in the table based at least in part on the amount of memory and the amount of processing power. It should be noted that various approaches can be implemented to clean up the table and the example provided is one such example.
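For purposes of illustration only, one possible caching rule can be sketched in Python as a bounded table that removes the least recently used entry when a limit is exceeded. The `OffsetTable` class is hypothetical, and the entry limit stands in for the memory and processing criteria described above; least-recently-used eviction is a choice made for this sketch, not a rule stated in the disclosure.

```python
from collections import OrderedDict


class OffsetTable:
    """Bounded table that drops the least recently used entry when full."""

    def __init__(self, max_entries: int):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._max_entries = max_entries
        self._entries = OrderedDict()

    def get(self, identifier):
        """Return the cached offsets, or None if the identifier is unknown."""
        if identifier not in self._entries:
            return None
        self._entries.move_to_end(identifier)  # mark as recently used
        return self._entries[identifier]

    def put(self, identifier, offsets):
        """Store offsets and clean up the table if it grew past the limit."""
        self._entries[identifier] = offsets
        self._entries.move_to_end(identifier)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict the oldest entry

    def __len__(self):
        return len(self._entries)


table = OffsetTable(max_entries=2)
table.put("type-a", [12, 16])
table.put("type-b", [9, 13])
table.get("type-a")           # "type-a" becomes the most recently used entry
table.put("type-c", [20, 24]) # exceeds the limit, so "type-b" is removed
```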

Identifying the keys can include parsing the message for keys and determining byte and/or message offset information to read those keys in the message or as copied into the string table. The keys can be copied into strings and recorded in the string table and the message offsets can be stored in offset data store 218. It should be noted that data stores 216 and 218 can be the same data store or, as shown, different data stores. When a subsequent message of the same type is received, message offsets of that type of message can be used to determine the location of the key in the message, and the value associated with the key can be read. The message does not have to be analyzed to determine message offsets to determine where to read the keys in the message. Instead, the message offsets have already been determined and can be used to extract the appropriate information. As described, an identifier can be used to determine messages of the same type. Messages of the same type can include messages with the same keys where the keys are in the same order. In the situation where an identifier indicating such information is not available, the message can be analyzed to determine the keys in the message and the order of the keys in the message.
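For purposes of illustration only, parsing a message for its keys while recording offsets can be sketched in Python for a flat JSON object. The function `scan_offsets` is hypothetical. The sketch works on character offsets, which equal byte offsets only for ASCII text, and it handles only a single top-level object.

```python
import json

_WHITESPACE = " \t\r\n"


def _skip_ws(text: str, i: int) -> int:
    """Advance past JSON whitespace starting at index i."""
    while i < len(text) and text[i] in _WHITESPACE:
        i += 1
    return i


def scan_offsets(text: str):
    """Return (string_table, offsets) for a flat JSON object.

    string_table lists the keys in message order; offsets holds one
    (start, end) pair per key, delimiting that key's value in the text.
    """
    decoder = json.JSONDecoder()
    i = _skip_ws(text, 0)
    if i >= len(text) or text[i] != "{":
        raise ValueError("expected a JSON object")
    i = _skip_ws(text, i + 1)
    string_table, offsets = [], []
    while i < len(text) and text[i] != "}":
        key, i = decoder.raw_decode(text, i)  # copy the key into a string
        i = _skip_ws(text, i)
        if i >= len(text) or text[i] != ":":
            raise ValueError("expected ':' after key")
        start = _skip_ws(text, i + 1)
        _value, end = decoder.raw_decode(text, start)  # find where the value ends
        string_table.append(key)
        offsets.append((start, end))
        i = _skip_ws(text, end)
        if i < len(text) and text[i] == ",":
            i = _skip_ws(text, i + 1)
    if i >= len(text):
        raise ValueError("unterminated JSON object")
    return string_table, offsets


message = '{"SomeValue": 1, "SomeOtherValue": "SomeString"}'
string_table, offsets = scan_offsets(message)
# Values can now be read by slicing at the recorded offsets.
values = [json.loads(message[start:end]) for start, end in offsets]
```

In a text format such as this one, the recorded offsets carry over to a later message only when its values occupy the same width; an encoding with fixed-width values does not have that limitation.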

In accordance with various embodiments, the message service may be performed by any number of server computing devices, desktop computing devices, mainframe computers, and the like. Each individual device may implement one of the modules of the message service. In some embodiments, the message service can include several devices physically or logically grouped together to implement one of the modules or components of the message service. For example, message service 202 can include various modules and components combined on a single device, multiple instances of a single module or component, etc. In one specific, non-limiting embodiment, search module 204 and processing module 206 can execute on one device and deserialization module 208 can execute on another device. In another embodiment, the message service can execute on the same device.

In some embodiments, the features and services provided by the message service may be implemented as web services consumable via a communication network. In further embodiments, the message service is provided by one or more virtual machines implemented in a hosted computing environment. The hosted computing environment may include one or more rapidly provisioned and released computing resources, which computing resources may include computing, networking and/or storage devices. A hosted computing environment may also be referred to as a cloud computing environment.

FIG. 3 and FIG. 4 illustrate example message schemas that can be utilized in accordance with various embodiments. As mentioned, a customer of a multi-tenant environment might send messages of different message types. The messages can include message fields, e.g., keys that include respective message data, e.g., values, a symbol table that describes the message fields included in the message, and an identifier that can be used to identify messages of the same message type. As shown in example 300 of FIG. 3, the message includes a symbol table 302. The symbol table 302 includes message fields (e.g., symbols, keys, etc.) 304. As shown, the message fields 304 for the message include “SomeValue” and “SomeOtherValue.” The respective message field values 306 for the message fields include “1” and “SomeString,” respectively.

Using conventional approaches, it can be expensive in terms of time and processing power to analyze each message to extract relevant information from the message to deserialize that message. For example, in a conventional approach, when a message is received, the message is analyzed to identify the message fields and associated message data. This process is repeated for each message received. Accordingly, approaches presented herein provide a framework that identifies successive messages of the same type, determines a protocol for those messages, and uses information about the protocol to deserialize messages of the same type without building up a reference structure around the structure of the message. When a message of a different type is received, the protocol for that message type can be determined, and the protocol can be used to deserialize successive messages of that type.

For example, message 300 can be obtained. It should be noted that message 300 is depicted in an example message schema of an array. An array data structure, or array, is a data structure that includes a collection of elements (e.g., values and variables), each identified by at least one array index or key. An array is stored so that the position of each element can be computed from its index tuple by a mathematical formula. An example data structure is a linear array, also called a one-dimensional array. Arrays may be used to implement tables, e.g., lookup tables, string tables, etc. An entry in a string table can be a string. A string is generally understood as a data type and is often implemented as an array of bytes (or words) that stores a sequence of elements, typically characters, using some character encoding. A string may also denote more general arrays or other sequence (or list) data types and structures.

In this example, elements of the message include message fields 304 and associated message field values 306. An entry in a string table can include the message fields and associated message field values, message field values, or a combination thereof. For example, bytes corresponding to the information can be copied from the message into, for example, a string table, where the information can be cached according to a set of rules. In accordance with various embodiments, any one of a number of data storage and retrieval approaches can be implemented. In this example, the entry can include the message fields and associated message field values.

The entry can be analyzed to determine the position of each element in the entry. This can include determining the position of the message fields and respective message data 306 (e.g., message field data) associated with those message fields in the entry. As described, the entry can be an array of bytes. The position of the message fields and respective message data can be designated by message offset information. In accordance with various embodiments, message offset information within an array or other data structure object can be an integer indicating the distance (displacement) between the beginning of the array and a given element or point, presumably within the same array. In this way, the message offset information can be a value indicating the displacement from the beginning of the message to each message field of message fields 304 in the array. In an embodiment, data following a particular message offset can correspond to the message field value (e.g., SomeString) for a corresponding message field (e.g., SomeOtherValue). In another embodiment, data following a particular message offset can correspond to a message field (e.g., SomeOtherValue), and data following the message field can correspond to the message field data (e.g., SomeString) associated with the message field.
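
A minimal sketch of offset-based reading follows. The wire layout is an assumption chosen only to keep the example small: each field is a 1-byte key length, the UTF-8 key bytes, and a fixed 4-byte big-endian integer value, so messages with the same keys in the same order share the same value offsets. The function names are hypothetical.

```python
import struct


def encode(fields):
    """Serialize (key, int) pairs in the toy layout described above."""
    out = bytearray()
    for key, value in fields:
        raw = key.encode("utf-8")
        out += bytes([len(raw)]) + raw + struct.pack(">I", value)
    return bytes(out)


def parse_with_offsets(message):
    """Full parse: walk the bytes and record where each value starts."""
    pos, values, offsets = 0, {}, []
    while pos < len(message):
        key_len = message[pos]
        key = message[pos + 1:pos + 1 + key_len].decode("utf-8")
        value_offset = pos + 1 + key_len  # displacement from the start of the message
        values[key] = struct.unpack_from(">I", message, value_offset)[0]
        offsets.append((key, value_offset))
        pos = value_offset + 4
    return values, offsets


def read_with_offsets(message, offsets):
    """Fast path: jump straight to each recorded offset without scanning keys."""
    return {key: struct.unpack_from(">I", message, off)[0] for key, off in offsets}


first = encode([("SomeValue", 1), ("SomeOtherValue", 7)])
values, offsets = parse_with_offsets(first)

second = encode([("SomeValue", 42), ("SomeOtherValue", 99)])
fast = read_with_offsets(second, offsets)
```

In this sketch the offsets recorded from the first message are integer displacements from the start of the byte array, and they are reused unchanged for the second message. Variable-width values would need a different layout or additional bookkeeping.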

As described, messages of the same type are associated with the same message offset information. That is, messages of the same type include message fields and associated message field data recorded in an array in the same way. Thus, message offset information, which describes the position of the message fields and associated message field data in a message, can be used to quickly obtain message field data for messages of the same type without having to parse the message.

In certain embodiments, the message can include an identifier. As shown in example 400 of FIG. 4, the message includes identifier 402, message fields 424, and message field data 426. The identifier 402 can be used to indicate messages of the same type. As described, messages of the same type include message fields and associated message field data recorded in an array in the same way. Thus, message offset information, which describes the position of the message fields and associated message field data, can be used to quickly obtain message field data for messages of the same type. The identifier can be any appropriate identifier. An example identifier is a hash value. In this example, message fields 424 include “rst” and “lne.” Message field “rst” is part of a shared symbol table, and includes message fields “SomeValue” and “SomeOtherValue.” As described, with respect to FIG. 3, the message field values are “1” and “SomeString,” respectively. Message field “lne” is associated with message field value “SomeString.”
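
As an illustration of a hash-value identifier, the sketch below derives one from the ordered message field names. The use of SHA-256, the length prefix, and the function name `type_identifier` are assumptions made for this example; any appropriate hashing technique could be used.

```python
import hashlib


def type_identifier(field_names):
    """Hash the ordered field names into a hex identifier (hypothetical helper)."""
    digest = hashlib.sha256()
    for name in field_names:
        raw = name.encode("utf-8")
        # Length-prefix each name so ("ab", "c") and ("a", "bc") hash differently.
        digest.update(len(raw).to_bytes(4, "big"))
        digest.update(raw)
    return digest.hexdigest()


id_a = type_identifier(["SomeValue", "SomeOtherValue"])
id_b = type_identifier(["SomeValue", "SomeOtherValue"])
id_c = type_identifier(["SomeOtherValue", "SomeValue"])
```

Because the names are hashed in order, two messages receive the same identifier only when they carry the same fields in the same arrangement.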

In accordance with an embodiment, to determine message offset information to quickly deserialize the message, identifier 402 associated with the message can be compared to a list of identifiers. In the situation where identifier 402 is not included in the list of identifiers, the message can be analyzed to determine the message offset information in accordance with the various approaches described herein. In the situation where identifier 402 is included in the list of identifiers, message offset information associated with the identifier can be obtained and the message offset information can be used to read message fields 424 and associated message field data 426 without having to parse the message.

FIG. 5 illustrates an example process 500 for deserializing messages that can be utilized in accordance with various embodiments. It should be understood that for this and other processes discussed herein that there can be additional, alternative, or fewer steps performed in similar or alternate orders, or at least partially in parallel, within the scope of the various embodiments unless otherwise specifically stated. In this example, a message is received 502. A message can include one or more message fields. Each message field can include message data and be associated with a message type. A message type can be based on the message fields included in a message and an order of those message fields in the message. In an embodiment, messages of the same type include the same message fields in the same order. In various embodiments, messages of the same type can be determined by an identifier included in the message, where messages of the same type include matching identifiers. As described herein, such an identifier is not necessary to determine messages of the same type. In various embodiments, messages can be analyzed to identify the message fields and the order of the message fields in the message, and messages that include the same message fields in the same order can be of the same message type. In this example, once the message is received, a determination is made 504 whether the message includes a message identifier. In the situation where the message does not include a message identifier, the message can be analyzed 506 to determine relevant information associated with the message. This can include determining message fields and associated message field data and message offset information. The message offset information can be utilized to determine a position in a table and/or message that includes the message field data for messages of the same type. An entry in a table that includes the relevant information can be generated 508. When a subsequent message of the same type is received 510, the message offset information is used 512 to determine the message field data from the subsequent message and/or table. A deserialized message that includes the relevant information associated with the subsequent message is generated 514. The subsequent message does not have to be analyzed to identify the message fields and corresponding message field data for those message fields. Instead, because the message offset information has already been determined for messages of the same type, the message offset information can be used to extract the appropriate information from the subsequent message.

In the situation where the message includes a message identifier, the identifier can be used to retrieve message offset information to locate relevant information associated with the message which can be used when deserializing the message. For example, when the message is received, the identifier associated with the message can be compared 516 to a list of identifiers. A determination 518 is made whether the identifier is included in the list of identifiers. In the situation where the identifier is not included in the list of identifiers, the message can be analyzed to determine the relevant information. In this case, the relevant information further includes the identifier. An entry that includes the relevant information (i.e., message fields and associated message field data, message offset information, and the identifier) can be generated in the table. Message offset information can be determined based on the location of the message fields in the message. When a subsequent message is received that includes the same identifier, and thus is of the same message type, the message offset information associated with the identifier can be used to read the message field data from the subsequent message without having to parse the subsequent message to identify the message fields to extract the associated message field data. Thereafter, a deserialized message that includes the relevant information associated with the subsequent message is generated. In the situation where the identifier is included in the list of identifiers, message offset information associated with the identifier can be obtained. The message offset information can be used to read the message field data from the message without having to parse the message to identify the message fields to extract the associated message field data. Thereafter, a deserialized message that includes the relevant information associated with the message is generated.

FIG. 6 illustrates an example process 600 for deserializing messages that can be utilized in accordance with various embodiments. In this example, a first message is received 602, the first message including a first message field associated with first message data and a first hash value generated based at least in part on the first message field. It is determined 604 that the first hash value is not included in a table of hash values. For example, the first hash value can be compared to the table of hash values. In the situation where the hash value is not present, an entry can be generated 606 in the table of hash values that includes the first hash value and message offset information that identifies a position in the first message that includes the first message data. A second message can be received 608, the second message including a second message field associated with second message data, and a second hash value. The table can be analyzed 610 to determine whether the second hash value matches the first hash value. In the situation where the second hash value does not match the first hash value, an entry can be generated 614 in the table that includes the second message data and message offset information. Thereafter, a deserialized message that includes at least the second message data can be generated 616. In the situation where the second hash value matches the first hash value, a position in the second message that includes the second message data can be determined 618 using the message offset information. The second message data can be obtained 620 and a deserialized message that includes the second message data can be generated 622.
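
The flow of process 600 can be sketched end to end under stated assumptions. The toy wire format (an 8-byte hash prefix followed by fields laid out as a 1-byte key length, key bytes, and a 4-byte big-endian value), the truncated SHA-256 digest, and the names `serialize` and `Deserializer` are all hypothetical and chosen only to make the example self-contained.

```python
import hashlib
import struct


def type_hash(keys):
    """8-byte hash over the ordered field names (assumed hashing scheme)."""
    digest = hashlib.sha256()
    for key in keys:
        raw = key.encode("utf-8")
        digest.update(bytes([len(raw)]) + raw)
    return digest.digest()[:8]


def serialize(fields):
    """Prefix the toy field layout with the hash value for the message type."""
    body = bytearray()
    for key, value in fields:
        raw = key.encode("utf-8")
        body += bytes([len(raw)]) + raw + struct.pack(">I", value)
    return type_hash([key for key, _ in fields]) + bytes(body)


class Deserializer:
    def __init__(self):
        self.table = {}       # hash value -> [(field name, value offset)]
        self.full_parses = 0  # counts how often the slow path runs

    def _analyze(self, message):
        # Slow path: walk the fields once to learn where each value starts.
        pos, offsets = 8, []
        while pos < len(message):
            key_len = message[pos]
            key = message[pos + 1:pos + 1 + key_len].decode("utf-8")
            offsets.append((key, pos + 1 + key_len))
            pos += 1 + key_len + 4
        self.full_parses += 1
        return offsets

    def deserialize(self, message):
        hash_value = message[:8]
        offsets = self.table.get(hash_value)
        if offsets is None:
            # Hash not in the table: analyze the message and generate an entry.
            offsets = self._analyze(message)
            self.table[hash_value] = offsets
        # Hash in the table: read values directly at the cached offsets.
        return {key: struct.unpack_from(">I", message, off)[0] for key, off in offsets}


service = Deserializer()
first = service.deserialize(serialize([("SomeValue", 1), ("SomeOtherValue", 2)]))
second = service.deserialize(serialize([("SomeValue", 30), ("SomeOtherValue", 40)]))
parses_after_match = service.full_parses
third = service.deserialize(serialize([("SomeOtherValue", 5)]))
```

In this sketch the second message matches the first hash value, so its values are read at the cached offsets and no further analysis occurs; the third message has a different hash value and triggers a new table entry.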

FIG. 7 illustrates a set of basic components of an example computing device 700 that can be utilized to implement aspects of the various embodiments. In this example, the device includes at least one processor 702 for executing instructions that can be stored in a memory device or element 704. As would be apparent to one of ordinary skill in the art, the device can include many types of memory, data storage or computer-readable media, such as a first data storage for program instructions for execution by the at least one processor 702, the same or separate storage can be used for images or data, a removable memory can be available for sharing information with other devices, and any number of communication approaches can be available for sharing with other devices. The device may include at least one type of display element 706, such as a touch screen, electronic ink (e-ink), organic light emitting diode (OLED) or liquid crystal display (LCD), although devices such as servers might convey information via other means, such as through a system of lights and data transmissions. The device typically will include one or more networking components 708, such as a port, network interface card, or wireless transceiver that enables communication over at least one network. The device can include at least one input device 710 able to receive conventional input from a user. This conventional input can include, for example, a push button, touch pad, touch screen, wheel, joystick, keyboard, mouse, trackball, keypad or any other such device or element whereby a user can input a command to the device. These I/O devices could even be connected by a wireless infrared or Bluetooth or other link as well in some embodiments. In some embodiments, however, such a device might not include any buttons at all and might be controlled only through a combination of visual and audio commands such that a user can control the device without having to be in contact with the device.

As discussed, different approaches can be implemented in various environments in accordance with the described embodiments. As will be appreciated, although a Web-based environment is used for purposes of explanation in several examples presented herein, different environments may be used, as appropriate, to implement various embodiments. The system includes an electronic client device, which can include any appropriate device operable to send and receive requests, messages or information over an appropriate network and convey information back to a user of the device. Examples of such client devices include personal computers, cell phones, handheld messaging devices, laptop computers, set-top boxes, personal data assistants, electronic book readers and the like. The network can include any appropriate network, including an intranet, the Internet, a cellular network, a local area network or any other such network or combination thereof. Components used for such a system can depend at least in part upon the type of network and/or environment selected. Protocols and components for communicating via such a network are well known and will not be discussed herein in detail. Communication over the network can be enabled via wired or wireless connections and combinations thereof. In this example, the network includes the Internet, as the environment includes a Web server for receiving requests and serving content in response thereto, although for other networks, an alternative device serving a similar purpose could be used, as would be apparent to one of ordinary skill in the art.

The illustrative environment includes at least one application server and a data store. It should be understood that there can be several application servers, layers or other elements, processes or components, which may be chained or otherwise configured, which can interact to perform tasks such as obtaining data from an appropriate data store. As used herein, the term “data store” refers to any device or combination of devices capable of storing, accessing and retrieving data, which may include any combination and number of data servers, databases, data storage devices and data storage media, in any standard, distributed or clustered environment. The application server can include any appropriate hardware and software for integrating with the data store as needed to execute aspects of one or more applications for the client device and handling a majority of the data access and business logic for an application. The application server provides access control services in cooperation with the data store and is able to generate content such as text, graphics, audio and/or video to be transferred to the user, which may be served to the user by the Web server in the form of HTML, XML or another appropriate structured language in this example. The handling of all requests and responses, as well as the delivery of content between the client device and the application server, can be handled by the Web server. It should be understood that the Web and application servers are not required and are merely example components, as structured code discussed herein can be executed on any appropriate device or host machine as discussed elsewhere herein.

The data store can include several separate data tables, databases or other data storage mechanisms and media for storing data relating to a particular aspect. For example, the data store illustrated includes mechanisms for storing content (e.g., production data) and user information, which can be used to serve content for the production side. The data store is also shown to include a mechanism for storing log or session data. It should be understood that there can be many other aspects that may need to be stored in the data store, such as page image information and access rights information, which can be stored in any of the above listed mechanisms as appropriate or in additional mechanisms in the data store. The data store is operable, through logic associated therewith, to receive instructions from the application server and obtain, update or otherwise process data in response thereto. In one example, a user might submit a search request for a certain type of item. In this case, the data store might access the user information to verify the identity of the user and can access the catalog detail information to obtain information about items of that type. The information can then be returned to the user, such as in a results listing on a Web page that the user is able to view via a browser on the user device. Information for a particular item of interest can be viewed in a dedicated page or window of the browser.

Each server typically will include an operating system that provides executable program instructions for the general administration and operation of that server and typically will include computer-readable medium storing instructions that, when executed by a processor of the server, allow the server to perform its intended functions. Suitable implementations for the operating system and general functionality of the servers are known or commercially available and are readily implemented by persons having ordinary skill in the art, particularly in light of the disclosure herein.

The environment in one embodiment is a distributed computing environment utilizing several computer systems and components that are interconnected via communication links, using one or more computer networks or direct connections. However, it will be appreciated by those of ordinary skill in the art that such a system could operate equally well in a system having fewer or a greater number of components than are illustrated. Thus, the depiction of the systems herein should be taken as being illustrative in nature and not limiting to the scope of the disclosure.

The various embodiments can be further implemented in a wide variety of operating environments, which in some cases can include one or more user computers or computing devices which can be used to operate any of a number of applications. User or client devices can include any of a number of general purpose personal computers, such as desktop or laptop computers running a standard operating system, as well as cellular, wireless and handheld devices running mobile software and capable of supporting a number of networking and messaging protocols. Such a system can also include a number of workstations running any of a variety of commercially-available operating systems and other known applications for purposes such as development and database management. These devices can also include other electronic devices, such as dummy terminals, thin-clients, gaming systems and other devices capable of communicating via a network.

Most embodiments utilize at least one network that would be familiar to those skilled in the art for supporting communications using any of a variety of commercially-available protocols, such as TCP/IP, FTP, UPnP, NFS, and CIFS. The network can be, for example, a local area network, a wide-area network, a virtual private network, the Internet, an intranet, an extranet, a public switched telephone network, an infrared network, a wireless network and any combination thereof.

In embodiments utilizing a Web server, the Web server can run any of a variety of server or mid-tier applications, including HTTP servers, FTP servers, CGI servers, data servers, Java servers and business application servers. The server(s) may also be capable of executing programs or scripts in response to requests from user devices, such as by executing one or more Web applications that may be implemented as one or more scripts or programs written in any programming language, such as Java®, C, C# or C++ or any scripting language, such as Perl, Python or TCL, as well as combinations thereof. The server(s) may also include database servers, including without limitation those commercially available from Oracle®, Microsoft®, Sybase® and IBM® as well as open-source servers such as MySQL, Postgres, SQLite, MongoDB, and any other server capable of storing, retrieving and accessing structured or unstructured data. Database servers may include table-based servers, document-based servers, unstructured servers, relational servers, non-relational servers or combinations of these and/or other database servers.

The environment can include a variety of data stores and other memory and storage media as discussed above. These can reside in a variety of locations, such as on a storage medium local to (and/or resident in) one or more of the computers or remote from any or all of the computers across the network. In a particular set of embodiments, the information may reside in a storage-area network (SAN) familiar to those skilled in the art. Similarly, any necessary files for performing the functions attributed to the computers, servers or other network devices may be stored locally and/or remotely, as appropriate. Where a system includes computerized devices, each such device can include hardware elements that may be electrically coupled via a bus, the elements including, for example, at least one central processing unit (CPU), at least one input device (e.g., a mouse, keyboard, controller, touch-sensitive display element or keypad) and at least one output device (e.g., a display device, printer or speaker). Such a system may also include one or more storage devices, such as disk drives, magnetic tape drives, optical storage devices and solid-state storage devices such as random access memory (RAM) or read-only memory (ROM), as well as removable media devices, memory cards, flash cards, etc.

Such devices can also include a computer-readable storage media reader, a communications device (e.g., a modem, a network card (wireless or wired), an infrared communication device) and working memory as described above. The computer-readable storage media reader can be connected with, or configured to receive, a computer-readable storage medium representing remote, local, fixed and/or removable storage devices as well as storage media for temporarily and/or more permanently containing, storing, transmitting and retrieving computer-readable information. The system and various devices also typically will include a number of software applications, modules, services or other elements located within at least one working memory device, including an operating system and application programs such as a client application or Web browser. It should be appreciated that alternate embodiments may have numerous variations from that described above. For example, customized hardware might also be used and/or particular elements might be implemented in hardware, software (including portable software, such as applets) or both. Further, connection to other computing devices such as network input/output devices may be employed.

Storage media and other non-transitory computer readable media for containing code, or portions of code, can include any appropriate media known or used in the art, such as but not limited to volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, including RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices or any other medium which can be used to store the desired information and which can be accessed by a system device. Based on the disclosure and teachings provided herein, a person of ordinary skill in the art will appreciate other ways and/or methods to implement the various embodiments.

The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. It will, however, be evident that various modifications and changes may be made thereunto without departing from the broader spirit and scope of the invention as set forth in the claims.

What is claimed is:
 1. A system, comprising: at least one processor; andmemory including instructions that, when executed by the at least oneprocessor, cause the system to: receive a serialized message, theserialized message including a first hash value and a message fieldassociated with message data; analyze a table to determine that thefirst hash value matches a second hash value; determine a position inthe serialized message that includes the message data using messageoffset information associated with the second hash value; obtain themessage data; and generate a deserialized data object that includes themessage data.
 2. The system of claim 1, wherein the instructions whenexecuted further cause the system to: receive a second serializedmessage at a client device in a provider environment, the secondserialized message including a plurality message fields in an orderedarrangement; use a hashing technique to generate a third hash valuebased at least in part on the plurality of message fields and theordered arrangement of the plurality of message fields; and update thesecond serialized message to include the third hash value.
 3. The systemof claim 1, wherein the instructions when executed further cause thesystem to: receive a second serialized message, the second serializedmessage including a plurality of message fields in an orderedarrangement; determine a message type associated with the secondserialized message based at least in part on the ordered arrangement ofthe plurality of message fields; determine second message offsetinformation for the message type; and store the second message offsetinformation of the message type.
 4. A system, comprising: at least oneprocessor; and memory including instructions that, when executed by theat least one processor, cause the system to: receive a serializedmessage that includes message data and an identifier, match theidentifier to an entry in a table, the entry including message offsetinformation; determine a position in the serialized message thatincludes the message data using the message offset information; andgenerate a deserialized data object based at least in part on themessage data and the message offset information.
 5. The system of claim4, wherein the message data includes binary data, and wherein theinstructions when executed further cause the system to write the binarydata into a string table of the table.
 6. The system of claim 5, whereinthe instructions when executed to further cause the system to analyzethe string table to determine the message offset information.
 7. Thesystem of claim 4, wherein the instructions when executed further causethe system to: receive a second serialized message at a client device ina provider environment, the second serialized message including a firstdata field type and a second data field type; generate a thirdidentifier based at least in part on the first data field type and thesecond data field type; and update the second serialized message toinclude the third identifier.
 8. The system of claim 7, wherein thethird identifier is further based at least in part on an orderedarrangement of the first data field type and the second data field type.9. The system of claim 4, wherein the instructions when executed furthercause the system to: determine an amount of memory to maintain thetable; determine an amount of processing power to maintain the table;and remove entries in the table based at least in part on the amount ofmemory and the amount of processing power.
 10. The system of claim 4,wherein a number of entries in the table is based at least in part onone of a frequency of receiving a particular identifier, an amount ofmemory to maintain the table, an amount of processing power to maintainthe table, a number of entries in the table, an amount of time to searchfor an entry in the table, or an importance indicator associated withentries in the table.
 11. The system of claim 4, wherein the serialized message is one of a JavaScript Object Notation (JSON) message format, an ION message format, a protobuf message format, a MessagePack message format, or a concise binary object representation (CBOR) message format.
 12. The system of claim 4, wherein the instructions when executed further cause the system to: receive a second serialized message that includes second message data, a first portion of the second message data associated with a first data field type and a second portion of the second message data associated with a second data field type; determine an order of the first data field type and the second data field type in the second serialized message; and match the order to a second entry in the table that includes second message offset information, the second message offset information used to determine a first position in the second serialized message that includes the first portion of the message data and a second position in the message that includes the second portion of the second message data.
 13. The system of claim 12, wherein the instructions when executed further cause the system to: generate a second deserialized data object based at least in part on the second message data and the second message offset information.
 14. The system of claim 4, wherein the serialized message includes a key-value pair and an ordered list of values, and wherein the key-value pair is realized as one of an object, a record, a struct, a dictionary, a hash table, a keyed list, or an associative array, and wherein the ordered list of values is realized as one of an array, a vector, a list, or a sequence.
 15. A computer-implemented method, comprising: receiving a serialized message that includes message data and an identifier, matching the identifier to an entry in a table, the entry including message offset information; determining a position in the serialized message that includes the message data using the message offset information; and generating a deserialized data object based at least in part on the message data and the message offset information.
 16. The computer-implemented method of claim 15 further comprising: receiving a second serialized message that includes second message data, a first portion of the second message data associated with a first data field type and a second portion of the second message data associated with a second data field type; determining an order of the first data field type and the second data field type in the second serialized message; matching the order to a second entry in the table that includes second message offset information, the second message offset information used to determine a first position in the second serialized message that includes the first portion of the second message data and a second position in the second serialized message that includes the second portion of the second message data; and generating a second deserialized data object based at least in part on the second message data and the second message offset information.
 17. The computer-implemented method of claim 15, further comprising: determining an amount of memory to maintain the table; determining an amount of processing power to maintain the table; and removing entries in the table based at least in part on the amount of memory and the amount of processing power.
 18. The computer-implemented method of claim 15, wherein the first message data includes binary data, the method further comprising: writing the binary data into a string table of the table; and analyzing the string table to determine the message offset information.
 19. The computer-implemented method of claim 15, wherein the serialized message is one of a JavaScript Object Notation (JSON) message format, an ION message format, a protobuf message format, a MessagePack message format, or a concise binary object representation (CBOR) message format.
 20. The computer-implemented method of claim 15, wherein the serialized message includes a key-value pair and an ordered list of values, and wherein the key-value pair is realized as one of an object, a record, a struct, a dictionary, a hash table, a keyed list, or an associative array, and wherein the ordered list of values is realized as one of an array, a vector, a list, or a sequence.
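
By way of a non-limiting illustration only, the table lookup recited in claims 4 and 15, the order-dependent identifier of claims 7 and 8, and the entry removal of claims 9 and 17 might be sketched as follows. Everything in this sketch is an assumption made for illustration and is not taken from the disclosure: the fixed-width binary layout with a 4-byte identifier header, the CRC32-based identifier, the least-recently-used removal policy standing in for the memory and processing limits, and the names `OffsetTable`, `type_identifier`, `serialize`, and `deserialize`.

```python
import struct
import zlib
from collections import OrderedDict

# Assumed wire layout: a 4-byte big-endian identifier, then fixed-width fields.
HEADER = struct.Struct(">I")


def type_identifier(fields):
    """Derive an identifier from the ordered (name, format) field pairs."""
    description = ",".join(f"{name}:{fmt}" for name, fmt in fields)
    return zlib.crc32(description.encode("utf-8"))


class OffsetTable:
    """Maps an identifier to cached offset information for a message type."""

    def __init__(self, max_entries=128):
        self.max_entries = max_entries  # stand-in for a memory/processing budget
        self._entries = OrderedDict()

    def register(self, fields):
        """Compute and cache the byte offset of every field of a message type."""
        ident = type_identifier(fields)
        offset = HEADER.size
        layout = []
        for name, fmt in fields:
            layout.append((name, offset, fmt))
            offset += struct.calcsize(">" + fmt)
        self._entries[ident] = layout
        self._entries.move_to_end(ident)
        # Remove the least recently used entries once the budget is exceeded.
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)
        return ident

    def lookup(self, ident):
        layout = self._entries.get(ident)
        if layout is not None:
            self._entries.move_to_end(ident)  # mark as recently used
        return layout


def serialize(ident, fields, values):
    """Prefix the packed field values with the message type identifier."""
    body = b"".join(struct.pack(">" + fmt, values[name]) for name, fmt in fields)
    return HEADER.pack(ident) + body


def deserialize(message, table):
    """Read each field directly at its cached offset, without parsing structure."""
    (ident,) = HEADER.unpack_from(message, 0)
    layout = table.lookup(ident)
    if layout is None:
        raise KeyError(f"no offset information cached for identifier {ident}")
    return {
        name: struct.unpack_from(">" + fmt, message, offset)[0]
        for name, offset, fmt in layout
    }


# Usage: register a message type once, then deserialize by offset lookup.
table = OffsetTable(max_entries=2)
fields = [("user_id", "I"), ("score", "d"), ("level", "h")]
ident = table.register(fields)
message = serialize(ident, fields, {"user_id": 7, "score": 2.5, "level": 3})
record = deserialize(message, table)
```

In this sketch a message of an unknown type raises an error; a fuller arrangement would presumably fall back to conventional parsing and then cache the resulting offset information for later messages of that type.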